DETAILED ACTION

Application Background 

1. This action is responsive to the Request for Continued Examination, filed 
on 12/13/2006. 

2. Applicant has cancelled claims 15, 34 and 72, and amended claims 1, 3,
16, 17, 20, 31, 68, 74 and 75. Claims 36-67 were previously canceled.

3. Claims 1-14, 16-33, 35, 68-71 and 73-75 are pending in the case. Claims
1, 31, 68 and 74 are independent claims.

4. A request for continued examination filed under 37 CFR 1.114, including 
the fee set forth in 37 CFR 1.17(e), was filed in this application after a final
rejection. Since this application is eligible for continued examination under 37 
CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the 
finality of the previous Office Action (dated 9/21/2006) has been withdrawn 
pursuant to 37 CFR 1.114. 

5. The examiner's rejection of claims 68-75, made under 35 USC 101, as 
described in the Claims Rejections - 35 USC 101 section of the previous 
office action (dated 9/21/2006) is withdrawn in view of the claim amendments. 

6. The examiner's rejection of claims 15, 34 and 72, made under 35 USC
102, as described in the Claims Rejections - 35 USC 102 section of the
previous office action (dated 9/21/2006) is withdrawn in view of the canceled
claims.

Claim Rejections - 35 USC § 102 

7. The following is a quotation of the appropriate paragraphs of 35 
U.S.C. 102 that form the basis for the rejections under this section made in 
this Office action: 

"A person shall be entitled to a patent unless - 

(b) The invention was patented or described in a printed publication in this 
or a foreign country or in public use or on sale in this country, more than 
one year prior to the date of application for patent in the United States." 

8. Claims 1-14, 16-33, 35, 68-71 and 73-75 are rejected under 35 
U.S.C. 102(b) as being anticipated by Yang et al. "HTML Page Analysis 
Based on Visual Cues" from the 6th International Conference on Document
Analysis and Recognition (ICDAR 2001), Seattle, Washington, USA, 
Copyright 2001 (hereinafter Yang). 

9. Regarding independent claim 1, Yang discloses identifying a plurality of 
visual blocks in a document and detecting one or more separators between 
the visual blocks. Yang recites: "records in one category are normally 
organized in ways having a consistent visual layout style. Boundaries 
between different categories are marked apparently with different visual styles 
or separators. As we have said, the basic idea of our approach is to detect 
these visual cues" (page 2, left column, third paragraph). 

Yang discloses initializing a separator list, analyzing the visual blocks, and
determining how to treat the separator. Yang recites: "Structured documents 
are constructed in a recursive manner. Starting from simple objects and group 
objects, we divide these elements into initial container objects roughly based 
on blocklevel tags [20]. Then we apply the pattern detection algorithm to 
elements of these initial container objects, and detected patterns are 
converted to list objects. For example, using container object and patterns of 
section 3.3, we can create a new container object as {e1, {{e2, e3, e4}, {e5,
e6, e7, e8}, {e9, e10, e11}, {e12, e13}}} where the underscored element is a
list object. Note that outliers between two list elements are appended as
do-not-cares" (page 4, section 3.4).

Yang discloses constructing a content structure for the document. Yang 
recites: "In section 3 we introduce our heuristics. After that, we talk about our
method to detect visual patterns and then to construct document structures 
based on these heuristics" (page 2, left column, second paragraph). Yang 
discloses the content structure identifying, for the different visual blocks, 
different portions of semantic content. Yang recites: "In this paper, we
propose a novel method to extract semantic structures from general HTML 
pages. This method doesn't require a priori knowledge of web pages. It uses 
features derived directly from layout of HTML pages" (page 1, last paragraph 
to page 2, first paragraph). 

10. Regarding dependent claim 2, Yang discloses web pages in Figure 3 on 
page 6. 

11. Regarding dependent claim 3, Yang discloses a document described by
a tree structure having a plurality of nodes. Yang's process is directed toward 
HTML documents. HTML documents are inherently processed by computers 
in a well-known process commonly referred to as parsing. Yang discloses 
parsing. Yang recites: "the process to parse HTML documents and extract 
simple objects" (page 2, right column, last paragraph). Parsing is a process 
where elements of a markup language document are placed into a tree 
structure in a relative way (i.e., there is a first or root element, with
subsequent elements related to it as children, and child elements can
themselves have children). These elements are commonly referred to as nodes.
Yang discloses identifying a group of candidate nodes, and for each node in 
the group: determining whether the node can be divided, and if the node 
cannot be divided, identifying the node as a visual block. Yang recites: 
"During the process to parse HTML documents and to extract simple objects" 
(page 2, right column, last paragraph), where Yang describes the simple 
object as "None-breakable visual HTML objects" (page 2, left column, last 
paragraph). 
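
For illustration only, the tree structure and the divide-or-identify step
discussed above can be sketched as follows. This is a minimal sketch and is
not taken from Yang or from the specification. The names Node, TreeBuilder,
can_divide, visual_blocks and parse_blocks are hypothetical, and the rule
that a node is dividable only when it has child elements is an assumption
made to keep the example short.

```python
from html.parser import HTMLParser

# Tags that never have a closing tag, so the parser must not descend into them.
VOID_TAGS = {"br", "hr", "img", "input", "meta", "link"}


class Node:
    """One element of the parsed document tree (hypothetical helper)."""

    def __init__(self, tag, parent=None):
        self.tag = tag
        self.parent = parent
        self.children = []
        self.text = ""


class TreeBuilder(HTMLParser):
    """Builds a parent/child tree of Node objects from HTML markup."""

    def __init__(self):
        super().__init__()
        self.root = Node("#document")
        self.current = self.root

    def handle_starttag(self, tag, attrs):
        node = Node(tag, self.current)
        self.current.children.append(node)
        if tag not in VOID_TAGS:
            self.current = node

    def handle_endtag(self, tag):
        # Walk up to the nearest open element with this tag, if there is one.
        node = self.current
        while node is not self.root and node.tag != tag:
            node = node.parent
        if node is not self.root:
            self.current = node.parent

    def handle_data(self, data):
        self.current.text += data.strip()


def can_divide(node):
    # Assumption: a node is dividable only if it has child elements.
    return bool(node.children)


def visual_blocks(node):
    """Return the nodes that cannot be divided, in document order."""
    if not can_divide(node):
        return [node]
    blocks = []
    for child in node.children:
        blocks.extend(visual_blocks(child))
    return blocks


def parse_blocks(html):
    builder = TreeBuilder()
    builder.feed(html)
    return visual_blocks(builder.root)
```

For the input "<div><p>Hello</p><ul><li>A</li><li>B</li></ul></div>",
parse_blocks returns three nodes (the p element and the two li elements),
since none of them has child elements under the assumed rule.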

12. Regarding dependent claim 4, Yang discloses setting a degree of 
coherence for the visual block. The specification defines the degree of 
coherence as "a measure of how coherent the visual block is" (page 15, lines 
13-14). Yang recites: "A modifier equals to zero means that two objects are
distinct or can't be compared" (page 2, right column, last paragraph).

13. Regarding claims 5-13, Yang discloses dividing nodes into their
respective child nodes based on criteria related to tags and node properties 
(including colors and sizes) on page 2, the bottom of the left column to the 
bottom of the right column. 

14. Regarding claims 14-30, Yang discloses detecting the one or more 
separators. Yang recites: "Boundaries between different categories are 
marked apparently with different visual styles or separators. As we have said, 
the basic idea of our approach is to detect these visual cues" (page 2, left 
column, third paragraph). 

15. Regarding claims 31-35, the claims are directed toward computer-readable
media for the method of claims 1-30 and are rejected using the same
rationale.

16. Regarding claims 68-75, the claims are directed toward a system for the
method of claims 1-30 and are rejected using the same rationale.

Response to Arguments 

17. Applicant's arguments filed 12/13/2006 have been fully considered but 
they are not persuasive. 

18. Regarding claim 1, applicant argues: "Yang does not disclose every 
element of Applicant's claim 1. For example, Yang does not show or disclose 
"...wherein detecting the one or more separators comprises initializing a
separator list that includes one or more possible separators between the 
visual blocks, analyzing, for each of the visual blocks, whether the visual 
block overlaps a separator of the separator list, and if so how the visual block 
overlaps the separator, and determining how to treat the separator based on 
whether the visual block overlaps the separator, and if so how the visual block 
overlaps the separator," as recited in Applicant's claim 1" (page 19, first
paragraph, of the response filed 12/13/2006). Applicant is directed to the 
rejection of claim 1 as restated above. Yang discloses initializing a separator 
list, analyzing the visual blocks, and determining how to treat the separator. 
Yang recites: "Structured documents are constructed in a recursive manner. 
Starting from simple objects and group objects, we divide these elements into 
initial container objects roughly based on blocklevel tags [20]. Then we apply 
the pattern detection algorithm to elements of these initial container objects, 
and detected patterns are converted to list objects. For example, using 
container object and patterns of section 3.3, we can create a new container 
object as {e1, {{e2, e3, e4}, {e5, e6, e7, e8}, {e9, e10, e11}, {e12, e13}}}
where the underscored element is a list object. Note that outliers between two
list elements are appended as do-not-cares" (page 4, section 3.4). 
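
For illustration only, the three steps recited in the quoted claim language
(initializing a separator list, analyzing whether and how each visual block
overlaps a separator, and determining how to treat the separator) can be
sketched as follows. This is a minimal sketch under stated assumptions and is
not taken from Yang or from the specification. It assumes horizontal
separators only, modeled as (start, end) vertical intervals, and blocks
modeled as (top, bottom) intervals. The split, remove and trim treatments and
the name detect_separators are hypothetical; the claim language itself only
requires that the treatment depend on whether and how the block overlaps the
separator.

```python
def detect_separators(page_height, blocks):
    """Return the separators left after every block has been considered.

    page_height: height of the page; the initial separator spans all of it.
    blocks: iterable of (top, bottom) intervals, one per visual block.
    """
    # Step 1: initialize the separator list with one possible separator.
    separators = [(0, page_height)]

    for top, bottom in blocks:
        updated = []
        for start, end in separators:
            # Step 2: analyze whether, and how, the block overlaps the separator.
            if bottom <= start or top >= end:
                # No overlap: keep the separator unchanged.
                updated.append((start, end))
            elif top <= start and bottom >= end:
                # Block covers the separator: remove it (assumed treatment).
                continue
            elif top > start and bottom < end:
                # Block lies inside the separator: split it (assumed treatment).
                updated.append((start, top))
                updated.append((bottom, end))
            elif top <= start:
                # Block overlaps the upper edge: trim it (assumed treatment).
                updated.append((bottom, end))
            else:
                # Block overlaps the lower edge: trim it (assumed treatment).
                updated.append((start, top))
        # Step 3: the treatment chosen above determines the new separator list.
        separators = updated

    return separators
```

Under these assumptions, a page of height 100 with blocks at (10, 30),
(40, 60) and (90, 100) leaves the separators (0, 10), (30, 40) and (60, 90).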

Conclusion



19. Any inquiry concerning this communication or earlier communications from 
the examiner should be directed to Gregory J. Vaughn whose telephone 
number is (571) 272-4131. The examiner can normally be reached Monday to 
Friday from 8:00 am to 5:00 pm. 

If attempts to reach the examiner by telephone are unsuccessful, the 
examiner's supervisor, Stephen S. Hong, can be reached at (571) 272-4124.
The fax phone number for the organization where this application or 
proceeding is assigned is (571) 272-2100. 

Information regarding the status of an application may be obtained from 
the Patent Application Information Retrieval (PAIR) system. Status 
information for published applications may be obtained from either Private 
PAIR or Public PAIR. Status information for unpublished applications is 
available through Private PAIR only. For more information about the PAIR 
system, see http://pair-direct.uspto.gov. Should you have questions on 
access to the Private PAIR system, contact the Electronic Business Center 
(EBC) at 866-217-9197 (toll-free). 